In this paper, we consider the problem of estimating the regression coefficients in multiple
regression models with several unknown change-points. Under some realistic assumptions, we propose
a class of estimators which includes as special cases shrinkage estimators (SEs) as well as the
unrestricted estimator (UE) and the restricted estimator (RE). We also derive a more general
condition for the SEs to dominate the UE. To this end, we generalize some identities for the
evaluation of the bias and risk functions of shrinkage-type estimators. As an illustrative example,
our method is applied to the "gross domestic product" data set of 10 countries, including the USA,
Canada, the UK, France and Germany. The simulation results corroborate our theoretical findings.
1. Introduction

In this paper, we study the multivariate regression models with multiple change-points 
occurring at unknown times. The target parameters are the regression coefficients while 
the unknown change points are treated as nuisance parameters. More specifically, we are
interested in the scenario where imprecise prior information about the regression coefficients
is available, that is, the target parameters may satisfy some restrictions.

The importance of change-point models in the literature is a primary source of our
motivation. Indeed, the regression model with change-points has been applied in many fields.
For example, this model was used in Broemeling and Tsurumi [4] for the US demand for
money, as well as in Lombard [11] for the effect of sudden changes in wind direction on
the flight of a projectile. It was also used to analyse DNA sequences (see, e.g., Braun and
Muller [3] and Fu and Curnow [5, 6]). To give some recent references, we quote Bai and
Perron [1], Zeileis et al. [20], Perron and Qu [16] among others.

More specifically, the method in Perron and Qu [16] is based on a global least squares
procedure. Generally, when the restriction holds, the restricted estimator (RE) dominates
the unrestricted estimator (UE). However, it is well known that the RE may perform
poorly when the restriction is seriously violated.

Over the years, shrinkage estimation has become a useful tool for deriving methods
which combine in an optimal way both imprecise prior knowledge from a hypothesized
restriction and the sample information. For more details about such a technique, we refer
to James and Stein [8], Baranchick [2], Judge and Bock [9], and the references therein. 
Also, to give some recent contributions about shrinkage methods, we quote Saleh [18], 
Nkurunziza and Ahmed [15], Nkurunziza [13] and Tan [19], among others. 

To the best of our knowledge, in the context of the multiple regression model with unknown
change-points, shrinkage methods have so far received little attention. Thus, we hope to
fill this gap by developing a class of shrinkage-type estimators which includes as special
cases the UE, RE, James-Stein type and positive shrinkage estimators as well as pretest
estimators. We also prove that the proposed shrinkage estimators (SEs) dominate the UE
in the mean square error sense. The technique in this paper extends the methods given in
the literature in two ways.

First, the asymptotic dependence structure between the shrinking factor (i.e., the
difference between the UE and the RE) and the RE is more general than that given
in the quoted papers. In particular, the asymptotic variance of the RE and the asymptotic
variance of (UE − RE) are not positive definite matrices, unlike in the problem studied in
Judge and Mittelhammer [10]. This is justified by the fact that, since the hypothesized
restriction is linear, these quantities are asymptotically equivalent to nonsurjective
linear (equivalent here to noninjective linear) transformations of the UE, whose
asymptotic variance is a positive definite matrix. In this case, it is impossible for the
asymptotic variance of the RE or that of (UE − RE) to be a positive definite matrix. To make
the justification more precise, let $A$ be a nonrandom $n \times n$ matrix with rank $n_0 < n$,
let $B$ be a nonrandom $n$-column vector, and let $F$ be an $n$-column random vector whose
variance is a positive definite matrix $\Phi$. Further, let $G = AF + B$, that is, a nonsurjective
linear transformation of the random vector $F$. Then $\mathrm{Var}(G) = A\Phi A'$, which cannot be
a positive definite matrix since $\mathrm{rank}(A\Phi A') = n_0 < n$.
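The rank argument above can be checked numerically on a toy case. The following is a minimal sketch in plain Python; the matrices `A` and `Phi` are illustrative values chosen here (they do not come from the paper), and the helper names are hypothetical.

```python
# Toy check: Var(G) = A Phi A' is singular when A is rank deficient.
# A and Phi below are illustrative assumptions, not values from the paper.

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

Phi = [[2.0, 0.5], [0.5, 1.0]]   # positive definite variance of F
A = [[1.0, 1.0], [2.0, 2.0]]     # rank 1 < 2, so F -> AF + B is nonsurjective

var_G = matmul(matmul(A, Phi), transpose(A))
print(det2(Phi) > 0)   # Phi is positive definite (leading minors 2.0 and 1.75)
print(var_G, det2(var_G))
```

Here the determinant of `var_G` vanishes, so `var_G` is only positive semidefinite, which is the situation faced by the asymptotic variances of the RE and of (UE − RE).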

Second, we derive a more general condition for the SEs to dominate the UE. To this
end, we generalize Theorem 1 and Theorem 2 of Judge and Bock [9], which are useful
in computing the bias and the risk functions of shrinkage-type estimators. As far as
the underlying asymptotic results are concerned, another difference from the work in
Judge and Mittelhammer [10] is that we derive the joint asymptotic
normality under weaker conditions than those in the quoted paper. Indeed, in Judge
and Mittelhammer [10], the variance-covariance of the error terms is a scalar matrix
(see the first paragraph of Section 2 in Judge and Mittelhammer [10]) and thus the
error terms are both homoscedastic and uncorrelated. In addition, in the quoted paper,
the regressors are assumed nonrandom. In this paper, the error terms do not need to
be homoscedastic and/or uncorrelated, and they may also be nonstationary stochastic
processes. Further, the regressors may be random and, in addition, they may be correlated
with the error terms. In summary, the proposed method is applicable to the statistical
model with familiar regularity conditions as assumed in Judge and Mittelhammer [10]
(see the last sentence of Section 2.4), as well as under unfamiliar regularity conditions for which
the dependence structure of the errors and regressors is as weak as that of a mixingale
array. The model considered here also accounts for the possibility of the change-points
phenomenon and, because of this, the derivation of the joint asymptotic normality
of the UE and RE is mathematically challenging. Moreover, the established results
extend those given, for example, in Perron and Qu [16].

In concluding this introduction, note that, because the conditions discussed above
are weaker than those in the literature, the shrinkage-type estimators
cannot be constructed by applying the results given in the quoted papers. Further, the
derivation of the asymptotic distributional risk (ADR) of shrinkage estimators (SEs) is
challenging and the instrumental identities in Judge and Bock [9], Theorems 1 and 2,
are not applicable. This motivated us to generalize these identities, which constitutes the
first aspect in which the main results differ significantly from the
quoted works. The second aspect is that the established ADR
has some extra terms and the risk dominance condition of SEs looks quite complicated.

The rest of this paper is organized as follows. Section 2 describes the statistical model 
and outlines the proposed estimation strategies. Section 3 gives the joint asymptotic 
normality of the unrestricted and restricted estimators. In Section 4, we introduce a class
of shrinkage-type estimators for the coefficients and derive its asymptotic distributional
risk. Section 5 presents some simulation studies and an illustrative analysis of a real
data set. Section 6 gives some concluding remarks and, for the convenience of the reader, 
technical proofs are given in the Appendix. 

2. Statistical model and assumptions 

In this section, we present the statistical model as well as the main regularity conditions.
As mentioned above, in this paper, we focus on the model with change-points. Nevertheless,
the proposed method is also useful in the linear model without change-points. In this last
case, the derivation of the joint asymptotic normality of the RE and UE is not as
mathematically involved as in the case of the model with change-points.

2.1. The linear model without change-points 

We consider the multiple linear regression model with $T$ observations for which the
response is a $T$-column vector $Y = (y_1, \ldots, y_T)'$, the regressors form a $T \times q_0$ matrix $Z$, the
regression coefficients form a $q_0$-column vector $\delta$, and the error term is a $T$-column vector
$u$. In particular, let

$$Y = Z\delta + u. \tag{2.1}$$

Further, we consider the scenario where prior knowledge about $\delta$ exists with some
uncertainty. More specifically, we consider the case where $\delta$ is suspected to satisfy the
following restriction

$$R\delta = r, \tag{2.2}$$

where $R$ is a known $k \times q_0$ matrix with rank $k < q_0$, and $r$ is a known $k$-column vector.
Under some regularity conditions on the error terms and the regressors, shrinkage
estimators for the parameter $\delta$ are available in the literature. To give some references, we quote
Saleh [18], Hossain et al. [7] among others. The shrinkage estimators given in the quoted
papers are members of the class of shrinkage estimators established in this paper.
Further, the established condition for the risk dominance of shrinkage estimators is more
general than that given, for example, in Saleh [18] and Hossain et al. [7].
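For the model in (2.1) under the restriction (2.2), the classical restricted least squares estimator can be obtained from the unrestricted one through the textbook formula $\tilde\delta = \hat\delta - (Z'Z)^{-1}R'[R(Z'Z)^{-1}R']^{-1}(R\hat\delta - r)$. The sketch below illustrates this for a two-column design and a single restriction; the data, the function names `ols_2col` and `restricted_from_unrestricted`, and the use of exact fractions are assumptions made purely for illustration.

```python
from fractions import Fraction as F

def ols_2col(Z, Y):
    """Return ((Z'Z)^{-1}, delta_hat) for a design with exactly two columns."""
    g11 = sum(z[0] * z[0] for z in Z)
    g12 = sum(z[0] * z[1] for z in Z)
    g22 = sum(z[1] * z[1] for z in Z)
    b1 = sum(z[0] * y for z, y in zip(Z, Y))
    b2 = sum(z[1] * y for z, y in zip(Z, Y))
    det = g11 * g22 - g12 * g12
    G_inv = [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]
    delta_hat = [G_inv[0][0] * b1 + G_inv[0][1] * b2,
                 G_inv[1][0] * b1 + G_inv[1][1] * b2]
    return G_inv, delta_hat

def restricted_from_unrestricted(G_inv, delta_hat, R, r):
    """Restricted LS for ONE linear restriction R delta = r (R is a row of length 2)."""
    GR = [G_inv[i][0] * R[0] + G_inv[i][1] * R[1] for i in range(2)]  # (Z'Z)^{-1} R'
    RGR = R[0] * GR[0] + R[1] * GR[1]                                 # R (Z'Z)^{-1} R'
    gap = R[0] * delta_hat[0] + R[1] * delta_hat[1] - r               # R delta_hat - r
    return [delta_hat[i] - GR[i] * gap / RGR for i in range(2)]

# Illustrative data: intercept plus a trend, with the slope suspected to be zero.
Z = [[F(1), F(x)] for x in range(4)]
Y = [F(1), F(3), F(2), F(5)]
G_inv, delta_hat = ols_2col(Z, Y)
delta_tilde = restricted_from_unrestricted(G_inv, delta_hat, R=[F(0), F(1)], r=F(0))
print(delta_hat, delta_tilde)
```

With the slope restricted to zero, the restricted intercept equals the sample mean of `Y`, as expected for an intercept-only fit.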

The proposed methodology is applicable to the model in (2.1) and (2.2) provided
that the conditions on the error and regressor terms are such that, as $T$ tends to infinity,

1. the matrices $T^{-1}Z'Z$ and $T^{-1}(Z'uu'Z)$ converge in probability to nonrandom
$q_0 \times q_0$ positive definite matrices;

2. $T^{-1/2}Z'u$ converges in distribution to a Gaussian random vector whose variance-
covariance is the limit in probability of $T^{-1}Z'uu'Z$.

These two points are generally satisfied in classical regression models where the error
terms are homoscedastic and independent, with linearly independent regressors. In the
sequel, we consider a very general model with change-points and heteroscedastic as well
as possibly correlated error terms. The assumptions of the model are discussed in the
next subsection.

2.2. The model with change-points 

Briefly, we consider the multiple linear regression model with $T$ observations and $m$
unknown break points $T_1, \ldots, T_m$ with $1 < T_1 < \cdots < T_m < T$. Here, it is important to
stress that the number of change-points $m$ is known. For convenience, let $T_0 = 1$ and
$T_{m+1} = T$. Namely, let

$$Y = Z\delta + u, \tag{2.3}$$

where $Y = (y_1, \ldots, y_T)'$ is a vector of $T$ dependent variables, $Z$ is a $T \times (m+1)q$
matrix of regressors given by $Z = \mathrm{diag}(Z_1, \ldots, Z_{m+1})$ with $Z_1 = (z_1, \ldots, z_{T_1})'$, and for
$j = 2, 3, \ldots, m+1$, $Z_j = (z_{T_{j-1}+1}, \ldots, z_{T_j})'$; $z_{T_{i-1}+1}$ is a $q$-column vector for $i = 1, 2, \ldots, m+1$.
Here, $u = (u_1, \ldots, u_T)'$ is the set of disturbances and $\delta$ is the $(m+1)q$ vector of
coefficients. Also, let $R$ be a known $k \times (m+1)q$ matrix with rank $k$, $k < (m+1)q$, and let $r$ be
a known $k$-column vector. We consider the case where $\delta$ may or may not satisfy the following
restriction

$$R\delta = r. \tag{2.4}$$

Let $\{T_1^0, \ldots, T_m^0\}$ be the true values of the break times $\{T_1, \ldots, T_m\}$, and $Z^0 =
\mathrm{diag}(Z_1^0, \ldots, Z_{m+1}^0)$, where $Z_j^0 = (z_{T_{j-1}^0+1}, \ldots, z_{T_j^0})'$. Set $\delta = (\delta_1', \delta_2', \ldots, \delta_{m+1}')'$ where, for
$i = 1, 2, \ldots, m+1$, $\delta_i$ is a $q$-column vector.
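To make the block-diagonal structure of $Z$ concrete, the following sketch assembles $Z = \mathrm{diag}(Z_1, \ldots, Z_{m+1})$ from the regressor rows and a candidate set of break dates. It is only an illustration: the function name `build_block_design`, the convention that each break date is the last observation of its regime, and the toy regressors $(1, t)$ are assumptions made here.

```python
def build_block_design(z, breaks):
    """Stack regime blocks on the diagonal.

    z      : list of T regressor rows, each of length q
    breaks : increasing break dates [T_1, ..., T_m]; regime j ends at T_j
             (1-based), and the last regime ends at T.
    Returns a T x (m+1)q matrix as a list of rows.
    """
    T, q = len(z), len(z[0])
    bounds = list(breaks) + [T]
    n_regimes = len(bounds)
    Z, regime = [], 0
    for t in range(1, T + 1):
        while t > bounds[regime]:        # move to the regime containing t
            regime += 1
        row = [0.0] * (n_regimes * q)
        row[regime * q:(regime + 1) * q] = z[t - 1]
        Z.append(row)
    return Z

# Toy example: T = 6, q = 2 with z_t = (1, t)', and m = 2 breaks at T_1 = 2, T_2 = 4.
z = [[1.0, float(t)] for t in range(1, 7)]
Z = build_block_design(z, breaks=[2, 4])
for row in Z:
    print(row)
```

Each observation only loads on the coefficient block of its own regime, which is why the coefficient vector $\delta$ has $(m+1)q$ components.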

To estimate the unknown parameters $(\delta_1', \ldots, \delta_{m+1}', T_1, \ldots, T_m)'$ based only on the
sample information given in $\{Y, Z\}$, one can use the least squares principle as described,
for example, in Perron and Qu [16]. Also, in case the restriction in (2.4) holds, it is
common to use the restricted least squares method in order to estimate the target parameter.
This gives the restricted estimator (RE) of $(\delta, T_1, \ldots, T_m)$. In particular, concerning the
change-points, let $\{\tilde T_1, \ldots, \tilde T_m\}$ denote the RE of the true change points from restricted
OLS and let $\{\hat T_1, \ldots, \hat T_m\}$ be the unrestricted estimators (UE). Also, let $\hat\delta$ and $\tilde\delta$ be,
respectively, the UE and RE for the regression coefficients $\delta$. Then, following the framework
in Perron and Qu [16], let $\mathrm{SSR}_T^R(T_1, \ldots, T_m)$ and $\mathrm{SSR}_T^U(T_1, \ldots, T_m)$ be the sums of squared
residuals from the restricted and unrestricted OLS regressions evaluated at the partition $\{T_1, \ldots, T_m\}$,
respectively. We have

$$
\begin{aligned}
(\hat T_1, \ldots, \hat T_m) &= \arg\min_{T_1, \ldots, T_m} \mathrm{SSR}_T^U(T_1, \ldots, T_m), \\
(\tilde T_1, \ldots, \tilde T_m) &= \arg\min_{T_1, \ldots, T_m} \mathrm{SSR}_T^R(T_1, \ldots, T_m).
\end{aligned}
\tag{2.5}
$$
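The unrestricted minimization in (2.5) can be illustrated with a deliberately simplified sketch. It assumes an intercept-only regression in each regime (so the segment fit is the segment mean), searches all admissible partitions exhaustively, and enforces a minimum regime length in the spirit of a trimming condition. It is not the efficient algorithm of Perron and Qu [16]; the function names and the toy series are hypothetical.

```python
from itertools import combinations

def segment_ssr(y):
    """Sum of squared residuals of an intercept-only fit on one regime."""
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)

def estimate_breaks(y, m, min_len):
    """Exhaustive search for m break dates minimising the total SSR.

    A break date T_j is the last observation (1-based) of regime j.
    Partitions with a regime shorter than min_len are skipped.
    Returns (minimal SSR, tuple of break dates).
    """
    T = len(y)
    best = None
    for breaks in combinations(range(1, T), m):
        bounds = (0,) + breaks + (T,)
        if any(b - a < min_len for a, b in zip(bounds, bounds[1:])):
            continue
        ssr = sum(segment_ssr(y[a:b]) for a, b in zip(bounds, bounds[1:]))
        if best is None or ssr < best[0]:
            best = (ssr, breaks)
    return best

# Toy series with level shifts after observations 4 and 8.
y = [0.0] * 4 + [5.0] * 4 + [1.0] * 4
ssr, breaks = estimate_breaks(y, m=2, min_len=2)
print(ssr, breaks)
```

The exhaustive search costs on the order of $T^m$ evaluations, so it is only meant to convey the objective being minimized, not to be used on long series.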


The optimality of the proposed method is based on the asymptotic properties of the UE
and RE. In particular, in Section 3, we establish as a preliminary step the joint asymptotic
normality of the UE and RE. To this end, we present below the regularity conditions.
To simplify the notation, let the $L^2$-norm of a random matrix $X$ be defined by
$\|X\|_2 = \bigl(\sum_i \sum_j \mathrm{E}|X_{ij}|^2\bigr)^{1/2}$ and let $\{\mathcal{F}_i, i = 1, 2, \ldots\}$ be a filtration. Also, let $o_p(a)$ denote a
random quantity such that $o_p(a)/a$ converges in probability to 0, and let $O_p(a)$ denote a
random quantity such that $O_p(a)/a$ is bounded in probability. Similarly, let $o(a)$ denote
a nonrandom quantity such that $o(a)/a$ converges to 0, and let $O(a)$ denote a nonrandom
quantity such that $O(a)/a$ is bounded. We also use the notations $\xrightarrow{d}$ and $\xrightarrow{p}$ to
stand for convergence in distribution and convergence in probability, respectively.


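Assumptions (Regularity conditions).

(A1) Let $L_p = T_{p+1}^0 - T_p^0$, $p = 1, \ldots, m$; then $(1/L_p)\sum_{t=T_p^0+1}^{T_p^0+[L_p v]} z_t z_t' \xrightarrow{p} Q_p(v)$, a
nonrandom positive definite matrix, uniformly in $v \in [0,1]$. Besides, there
exists an $L_0 > 0$ such that for all $L_p > L_0$, the minimum eigenvalues of
$(1/L_p)\sum_{t=T_p^0+1}^{T_p^0+L_p} z_t z_t'$ and of $(1/L_p)\sum_{t=T_p^0-L_p}^{T_p^0} z_t z_t'$ are bounded away from 0.

(A2) The matrix $\sum_{t=i_1}^{i_2} z_t z_t'$ is invertible for $0 < i_2 - i_1 < \epsilon_0 T$ for some $\epsilon_0 > 0$.

(A3) $T_p^0 = [T\lambda_p^0]$, where $p = 1, \ldots, m+1$ and $0 < \lambda_1^0 < \cdots < \lambda_m^0 < \lambda_{m+1}^0 = 1$.

(A4) The minimization problem defined by (2.5) is taken over all possible partitions
such that $T_i - T_{i-1} \ge \epsilon T$ ($i = 1, \ldots, m+1$) for some $\epsilon > 0$.

(A5) For each segment $(T_{p-1}^0, T_p^0)$, $p = 1, \ldots, m+1$, set $X_{pi} = T^{-1/2} z_{T_{p-1}^0+i}\, u_{T_{p-1}^0+i}$
and set $\mathcal{F}_{p,i} = \mathcal{F}_{T_{p-1}^0+i}$. We assume that $\{X_{pi}, \mathcal{F}_{p,i}\}$ forms an $L^2$-mixingale array
of size $-1/2$. That is, there exist nonnegative constants $\{c_{pi} : i \ge 1\}$ and $\psi(j)$,
$j \ge 0$, such that $\psi(j) \downarrow 0$ as $j \to \infty$ and, for $i \ge 1$, $j \ge 0$,

$$\|\mathrm{E}(X_{pi} \mid \mathcal{F}_{p,i-j})\|_2 \le c_{pi}\,\psi(j),$$

$$\|X_{pi} - \mathrm{E}(X_{pi} \mid \mathcal{F}_{p,i+j})\|_2 \le c_{pi}\,\psi(j+1), \qquad \psi(j) = O(j^{-1/2-\epsilon})$$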
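for some $\epsilon > 0$. Also, let $L_p = T_p^0 - T_{p-1}^0$, and define $l_p$, $b_p$ and $r_p = [L_p/b_p]$ such
that $b_p \ge l_p + 1$, $l_p \ge 1$, $b_p \le L_p$. We assume that, as $L_p \to \infty$,
$b_p \to \infty$, $l_p \to \infty$, $b_p/L_p \to 0$ and $l_p/b_p \to 0$.

(A6) For $p = 1, \ldots, m+1$ and for $s = 1, \ldots, q$, $\{X_{pi,s}^2/c_{pi}^2, i = 1, 2, \ldots\}$ is uniformly
integrable;

$$\max_{1 \le i \le L_p} c_{pi} = o\bigl(b_p^{-1/2}\bigr); \qquad \sum_{i=1}^{r_p}\Bigl(\max_{(i-1)b_p+1 \le t \le i b_p} c_{pt}\Bigr)^2 = O\bigl(b_p^{-1}\bigr)$$

and

$$\sum_{i=1}^{r_p}\Bigl(\sum_{t=(i-1)b_p+l_p+1}^{i b_p} X_{pt}\Bigr)\Bigl(\sum_{t=(i-1)b_p+l_p+1}^{i b_p} X_{pt}\Bigr)' \xrightarrow[L_p \to \infty]{p} \Sigma_p.$$

Moreover, let $V_{j,i} = \sum_{t=(i-1)b_j+l_j+1}^{i b_j} X_{jt}$, $j = 1, 2, \ldots, m+1$. Let $r_{(1)} = \min_{1 \le j \le m}(r_j)$,
let $r_{(m)} = \max_{1 \le j \le m}(r_j)$, and let $L_{\min} = \min(L_1, \ldots, L_{m+1})$. We have

1. $\sum_{i=r_{(1)}+1}^{r_{(m)}}\bigl(\max_{(i-1)b_j+1 \le t \le i b_j} c_{jt}\bigr)^2 = o\bigl(b_j^{-1}\bigr)$, $j = 1, 2, \ldots, m+1$.

2. $\sum_{i=1}^{r_{(1)}} (V_{1,i}', V_{2,i}', \ldots, V_{m+1,i}')'(V_{1,i}', V_{2,i}', \ldots, V_{m+1,i}') \xrightarrow[L_{\min} \to \infty]{p} \Omega$, where $\Omega$ is a nonrandom
positive definite matrix.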

For the interpretation of Assumptions (A1)-(A4), we refer to Perron and Qu [16]. In
summary, Assumptions (A1) and (A2) are usually imposed in multiple linear regressions
with structural changes. Further, Assumption (A3) guarantees asymptotically
distinct change points and Assumption (A4) puts a lower bound on the distance between
breaks. As mentioned in Perron and Qu [16], this assumption is stronger than the similar
condition in the literature. As justified in the quoted paper, this is the cost needed to allow for
heterogeneity and serial correlation in the errors. Assumptions (A5)-(A6) are needed to
establish the asymptotic normality of the UE. Note that Assumption (A5) considers the
case of mixingale random variables, which allows both the regressors and the errors in
each regime to have different distributions and to be asymptotically weakly dependent.


3. The joint asymptotic distribution of the UE and RE

In this section, we derive the joint asymptotic normality of the restricted and
unrestricted OLS estimators. Under Assumptions (A1)-(A4), $T^{-1}Z^{0\prime}Z^0$ converges in probability to a
nonrandom $q(m+1) \times q(m+1)$ positive definite matrix. Hereafter, we denote this
matrix by $\Gamma$. Also, under Assumption (A6), $T^{-1}(Z^{0\prime}uu'Z^0)$ converges in probability to
$\Omega$, which is a nonrandom $q(m+1) \times q(m+1)$ positive definite matrix. Further, under
Assumptions (A5)-(A6), we establish the following lemma, which is crucial in establishing
the joint asymptotic normality of the UE and RE.
Lemma 3.1. Under Assumptions (A1)-(A6), $T^{-1/2}Z^{0\prime}u \xrightarrow[T \to \infty]{d} \mathcal{N}_{(m+1)q}(0, \Omega)$.

The proof is given in Appendix B. Also, note that if the restriction in (2.4) does
not hold, the asymptotic distribution of $\tilde\delta$ may degenerate. Thus, in order to derive the
joint asymptotic normality, we consider the following sequence of local alternatives,

$$H_{1T}: R\delta = r + \frac{\mu}{\sqrt{T}}, \qquad T = 1, 2, \ldots, \tag{3.1}$$


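with $\|\mu\| < \infty$. To simplify the notation, let $\hat\delta$ and $\tilde\delta$ denote, respectively, the UE and RE
of $\delta$. Let $J_0 = \Gamma^{-1}R'(R\Gamma^{-1}R')^{-1}$, and let $I_m$ denote the $m \times m$ identity matrix. Further, let

$$
\begin{aligned}
\mu_1 &= -J_0\mu, \quad \Sigma_{11} = \Gamma^{-1}\Omega\Gamma^{-1}, \quad \Sigma_{12} = \Gamma^{-1}\Omega\Gamma^{-1}\bigl(I_{(m+1)q} - R'J_0'\bigr), \\
\Sigma_{21} &= \Sigma_{12}', \quad \Sigma_{22} = \bigl(I_{(m+1)q} - J_0R\bigr)\Gamma^{-1}\Omega\Gamma^{-1}\bigl(I_{(m+1)q} - R'J_0'\bigr), \\
\Lambda_{11} &= J_0R\Sigma_{11}R'J_0', \quad \Lambda_{12} = J_0R\Sigma_{12}, \quad \Lambda_{21} = \Lambda_{12}', \quad \Lambda_{22} = \Sigma_{22}.
\end{aligned}
$$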


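Lemma 3.2. Under Assumptions (A1)-(A6), and the sequence of local alternatives in
(3.1),

$$
\begin{pmatrix} \sqrt{T}(\hat\delta - \delta^0) \\ \sqrt{T}(\tilde\delta - \delta^0) \end{pmatrix}
\xrightarrow[T \to \infty]{d}
\begin{pmatrix} \varepsilon_3 \\ \varepsilon_4 \end{pmatrix}
\sim \mathcal{N}_{2(m+1)q}\left(
\begin{pmatrix} 0 \\ \mu_1 \end{pmatrix},
\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
\right),
$$

$$
\begin{pmatrix} \sqrt{T}(\hat\delta - \tilde\delta) \\ \sqrt{T}(\tilde\delta - \delta^0) \end{pmatrix}
\xrightarrow[T \to \infty]{d}
\begin{pmatrix} \varepsilon_5 \\ \varepsilon_4 \end{pmatrix}
\sim \mathcal{N}_{2(m+1)q}\left(
\begin{pmatrix} -\mu_1 \\ \mu_1 \end{pmatrix},
\begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{pmatrix}
\right).
$$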

From the above result, it should be noted that the components of $(\varepsilon_5', \varepsilon_4')'$, the limit in distribution
of $(\sqrt{T}(\hat\delta - \tilde\delta)', \sqrt{T}(\tilde\delta - \delta^0)')'$, are not uncorrelated, unlike, for example, in Saleh [18], Theorem 3,
page 375, and Hossain et al. [7], among others. Further, note that $\Lambda_{11}$ and $\Lambda_{22}$ are not
positive definite matrices, unlike in Judge and Mittelhammer [10]. Because of that,
the shrinkage-type estimators cannot be constructed by applying the results
given in the literature.
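As a hedged side calculation, the form of the cross-covariance between the two limits can be traced back to the restricted least squares representation. The display below assumes that, asymptotically, $\hat\delta - \tilde\delta$ behaves like $J_0(R\hat\delta - r)$ (the classical restricted least squares relation with $\Gamma$ in place of its sample counterpart) and uses $\Gamma$, $\Omega$, $J_0$ and $\mu$ as defined in Section 3; it is a sketch of the bookkeeping, not a substitute for the proof in the Appendix.

```latex
\begin{align*}
\sqrt{T}(\hat\delta-\tilde\delta)
  &\approx J_0\bigl(R\sqrt{T}(\hat\delta-\delta^0)+\mu\bigr),
  && \text{since } R\delta^0 - r = \mu/\sqrt{T} \text{ under (3.1)},\\
\sqrt{T}(\tilde\delta-\delta^0)
  &\approx \bigl(I_{(m+1)q}-J_0R\bigr)\sqrt{T}(\hat\delta-\delta^0)-J_0\mu,\\
\operatorname{Cov}(\varepsilon_5,\varepsilon_4)
  &= J_0R\,\Sigma_{11}\bigl(I_{(m+1)q}-R'J_0'\bigr)
   = J_0R\,\Sigma_{12}.\\
\intertext{In the classical case $\Omega=\sigma^2\Gamma$, one has $\Sigma_{11}=\sigma^2\Gamma^{-1}$ and}
J_0R\,\Gamma^{-1}R'J_0'
  &= \Gamma^{-1}R'(R\Gamma^{-1}R')^{-1}R\,\Gamma^{-1}
   = J_0R\,\Gamma^{-1},
\end{align*}
```

so that the cross-covariance $J_0R\Sigma_{12}$ vanishes in that case. This is consistent with the uncorrelated limits found in the classical homoscedastic setting, whereas for a general $\Omega$ the term does not need to vanish.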


4. Shrinkage estimator and related asymptotic properties

It is well known that, under the restriction in (2.4), the RE dominates the UE in the mean square error
sense. However, if the restriction in (2.4) is seriously violated, the RE performs
poorly. In some scenarios, the prior restriction in (2.4) is subject to some uncertainty
that may be induced by a change in the phenomenon underlying the regression model
in (2.3). Under such uncertainty, it is of interest to propose a statistical method which
combines in an optimal way the sample information and the uncertain information given in
(2.4).
In this section, we introduce a class of shrinkage estimators which includes the UE, the RE,
as well as the Stein-type estimator and the positive part Stein-type estimator. To simplify some
notations, let $A = R'(R\Gamma^{-1}\Omega\Gamma^{-1}R')^{-1}R$ and $\hat A = R'(R\hat\Gamma^{-1}\hat\Omega\hat\Gamma^{-1}R')^{-1}R$, where $\hat\Omega$ and
$\hat\Gamma$ denote consistent estimators of $\Omega$ and $\Gamma$, respectively. Also, as in Nkurunziza [14], let $h$
be a continuous (except at a finite number of points), real-valued and integrable function
(with respect to the Gaussian measure). We consider the following class of estimators

$$\hat\beta(h) = \tilde\delta + h\bigl(T(\hat\delta - \tilde\delta)'\hat A(\hat\delta - \tilde\delta)\bigr)(\hat\delta - \tilde\delta). \tag{4.1}$$

It should be noted that for the case where $h \equiv 0$, $\hat\beta(0)$ is the RE $\tilde\delta$. Also, if $h \equiv 1$, we
have the UE, that is, $\hat\beta(1) = \hat\delta$. Further, by choosing a suitable $h$ one can get the pretest
estimators as given, for example, in Saleh [18], Hossain et al. [7], among others. Finally,
the James-Stein estimator $\hat\delta^{S}$ and the Positive-Rule Stein estimator $\hat\delta^{S+}$ are members of the
class in (4.1). Indeed, let $k$ denote the rank of the matrix $R$ as defined in (2.4). By taking
$h(x) = 1 - (k-2)/x$, $x > 0$, and $h(x) = \max\{0, 1 - (k-2)/x\}$, $x > 0$, we get $\hat\delta^{S}$ and $\hat\delta^{S+}$,
respectively. More precisely, we have $\hat\delta^{S} = \tilde\delta + \bigl(1 - \frac{k-2}{\psi}\bigr)(\hat\delta - \tilde\delta)$ and $\hat\delta^{S+} = \tilde\delta + \bigl(1 - \frac{k-2}{\psi}\bigr)^{+}(\hat\delta - \tilde\delta)$,
where $\psi = T(\hat\delta - \tilde\delta)'\hat A(\hat\delta - \tilde\delta)$, with $x^{+} = \max(0, x)$.
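The class in (4.1) is straightforward to express in code once the UE, the RE and the test statistic $\psi$ are available. The sketch below takes these three quantities as given inputs (computing $\psi$ requires the estimated matrix $\hat A$, which is not reproduced here); the function names and numerical values are hypothetical.

```python
def shrink(delta_hat, delta_tilde, psi, h):
    """Member of the class (4.1): RE + h(psi) * (UE - RE), componentwise."""
    w = h(psi)
    return [rt + w * (ut - rt) for ut, rt in zip(delta_hat, delta_tilde)]

def h_james_stein(k):
    """h(x) = 1 - (k - 2)/x, giving the James-Stein type estimator."""
    return lambda x: 1.0 - (k - 2) / x

def h_positive_part(k):
    """h(x) = max{0, 1 - (k - 2)/x}, giving the positive-rule estimator."""
    return lambda x: max(0.0, 1.0 - (k - 2) / x)

# Illustrative inputs (not from the paper): UE, RE, rank k of R, and psi.
delta_hat, delta_tilde, k = [2.0, 4.0], [1.0, 2.0], 4

re = shrink(delta_hat, delta_tilde, psi=1.0, h=lambda x: 0.0)      # h = 0 -> RE
ue = shrink(delta_hat, delta_tilde, psi=1.0, h=lambda x: 1.0)      # h = 1 -> UE
js_large = shrink(delta_hat, delta_tilde, 4.0, h_james_stein(k))   # weight 0.5
js_small = shrink(delta_hat, delta_tilde, 1.0, h_james_stein(k))   # weight -1 (over-shrinks)
ps_small = shrink(delta_hat, delta_tilde, 1.0, h_positive_part(k)) # weight truncated at 0
print(re, ue, js_large, js_small, ps_small)
```

When $\psi < k - 2$ the James-Stein weight becomes negative and the estimator moves past the RE, which is what the positive-part rule prevents by truncating the weight at zero.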

In order to evaluate the performance of the proposed estimators, we consider the
quadratic loss function $L(\hat\theta, \theta) = (\hat\theta - \theta)'W(\hat\theta - \theta)$, where $W$ is a symmetric nonnegative
definite matrix, and use the asymptotic distributional risk (ADR) as defined, for example,
in Saleh [18]. For the convenience of the reader, we recall that the ADR of an estimator $\hat\theta$
is defined as $\mathrm{ADR}(\hat\theta, \theta; W) = \mathrm{E}[\rho'W\rho]$, with $\rho$ the limit in distribution of $\sqrt{T}(\hat\theta - \theta)$
as $T$ tends to infinity, and $W$ a given nonnegative definite weight matrix.

In the sequel, we set $\Delta = \mu_1'A\mu_1$ and assume that the weight matrix $W$ satisfies W =
with W* a symmetric nonnegative definite matrix. We establish below 
a lemma which gives the ADR of estimators which are members of the class in (4.1). 
Briefly, the derivation of this lemma is based on the identity, established in Appendix C,
which generalizes Theorem 2 in Judge and Bock [9]. In particular, this lemma is useful
in deriving the ADR of $\hat\delta$, $\tilde\delta$, $\hat\delta^{S}$ and $\hat\delta^{S+}$.

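Lemma 4.1. Suppose that Assumptions (A1)-(A6) and the sequence of local alternatives
in (3.1) hold. Then

$$
\begin{aligned}
\mathrm{ADR}(\hat\beta(h), \delta^0, W)
={}& \mathrm{ADR}(\tilde\delta, \delta^0, W) - 2\mathrm{E}[h(\chi^2_{k+2}(\Delta))]\,\mu_1'W\mu_1 \\
& - 2\mathrm{E}[h(\chi^2_{k+2}(\Delta))]\,\mu_1'A\Lambda_{12}W\mu_1 + 2\mathrm{E}[h(\chi^2_{k+2}(\Delta))]\,\mathrm{trace}(\Lambda_{12}W\Lambda_{11}A) \\
& + 2\mathrm{E}[h(\chi^2_{k+4}(\Delta))]\,\mu_1'A\Lambda_{12}W\mu_1 \\
& + \mathrm{E}[h^2(\chi^2_{k+2}(\Delta))]\,\mathrm{trace}(W\Lambda_{11}) + \mathrm{E}[h^2(\chi^2_{k+4}(\Delta))]\,\mu_1'W\mu_1.
\end{aligned}
\tag{4.2}
$$

Proof. The proof of this lemma follows directly by combining Lemma 3.2, Theorem C.2
and Lemma C.3. □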
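From Lemma 4.1, by taking $h(x) = 1$, $h(x) = 0$, $h(x) = 1 - \frac{k-2}{x}$ and $h(x) = \max\{0, (1 - \frac{k-2}{x})\}$,
we establish the following corollary, which gives the ADR of the estimators $\hat\delta$, $\tilde\delta$,
$\hat\delta^{S}$ and $\hat\delta^{S+}$, respectively.

Corollary 4.1. Suppose that the conditions of Lemma 4.1 hold. Then

$$
\begin{aligned}
\mathrm{ADR}(\hat\delta, \delta^0, W) ={}& \mathrm{trace}(W\Gamma^{-1}\Omega\Gamma^{-1}), \\
\mathrm{ADR}(\tilde\delta, \delta^0, W) ={}& \mathrm{trace}\bigl[W(I_{q(m+1)} - J_0R)\Gamma^{-1}\Omega\Gamma^{-1}(I_{q(m+1)} - R'J_0')\bigr] + \mu_1'W\mu_1, \\
\mathrm{ADR}(\hat\delta^{S}, \delta^0, W) ={}& \mathrm{ADR}(\hat\delta, \delta^0, W) - 2(k-2)\mathrm{E}[\chi^{-2}_{k+2}(\Delta)]\,\mathrm{trace}(W(\Lambda_{11} + \Lambda_{12})) \\
& + (k^2-4)\mathrm{E}[\chi^{-4}_{k+4}(\Delta)]\,\mu_1'W\mu_1 + (k-2)^2\mathrm{E}[\chi^{-4}_{k+2}(\Delta)]\,\mathrm{trace}(W\Lambda_{11}) \\
& + 4(k-2)\mathrm{E}[\chi^{-4}_{k+4}(\Delta)]\,\mu_1'A\Lambda_{12}W\mu_1,
\end{aligned}
$$

$$
\begin{aligned}
\mathrm{ADR}(\hat\delta^{S+}, \delta^0, W)
={}& \mathrm{ADR}(\hat\delta^{S}, \delta^0, W) \\
& + 2\mathrm{E}\bigl(I(\chi^2_{k+2}(\Delta) < k-2) - (k-2)\chi^{-2}_{k+2}(\Delta)I(\chi^2_{k+2}(\Delta) < k-2)\bigr)\mu_1'W\mu_1 \\
& + 2\mathrm{E}\bigl(I(\chi^2_{k+2}(\Delta) < k-2) - (k-2)\chi^{-2}_{k+2}(\Delta)I(\chi^2_{k+2}(\Delta) < k-2)\bigr)\mu_1'A\Lambda_{12}W\mu_1 \\
& - 2\mathrm{E}\bigl(I(\chi^2_{k+2}(\Delta) < k-2) - (k-2)\chi^{-2}_{k+2}(\Delta)I(\chi^2_{k+2}(\Delta) < k-2)\bigr)\mathrm{trace}(W\Lambda_{12}) \\
& - 2\mathrm{E}\bigl(I(\chi^2_{k+4}(\Delta) < k-2) - (k-2)\chi^{-2}_{k+4}(\Delta)I(\chi^2_{k+4}(\Delta) < k-2)\bigr)\mu_1'A\Lambda_{12}W\mu_1 \\
& - \mathrm{E}\bigl(I(\chi^2_{k+2}(\Delta) < k-2) - 2(k-2)\chi^{-2}_{k+2}(\Delta)I(\chi^2_{k+2}(\Delta) < k-2) \\
& \qquad + (k-2)^2\chi^{-4}_{k+2}(\Delta)I(\chi^2_{k+2}(\Delta) < k-2)\bigr)\mathrm{trace}(W\Lambda_{11}) \\
& - \mathrm{E}\bigl(I(\chi^2_{k+4}(\Delta) < k-2) - 2(k-2)\chi^{-2}_{k+4}(\Delta)I(\chi^2_{k+4}(\Delta) < k-2) \\
& \qquad + (k-2)^2\chi^{-4}_{k+4}(\Delta)I(\chi^2_{k+4}(\Delta) < k-2)\bigr)\mu_1'W\mu_1.
\end{aligned}
\tag{4.3}
$$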

It should be noted that the expressions in Corollary 4.1 are more general than those given,
for example, in Saleh [18], page 377, and Hossain et al. [7], for which $\Lambda_{12} = 0$.

From Corollary 4.1, we establish the following corollary, which shows that shrinkage
estimators dominate the UE. Note that, due to the asymptotic dependence structure
between the shrinking factor and the restricted estimator, the dominance
condition looks quite complicated. To simplify the notation, let $\mathrm{Ch}_{\max}(\Pi)$ denote the
largest eigenvalue of $\Pi$, and let $\mathrm{Ch}_{\min}(\Pi)$ denote the smallest eigenvalue of $\Pi$. Further,
let $R_0 = A^{1/2}(\Lambda_{11} + 4\Lambda_{12}/(k+2))W\Lambda_{11}A^{1/2}$ and $\Pi^{*} = (R_0 + R_0')/2$.
Corollary 4.2. Suppose that Assumptions (A1)-(A6) hold, and let $W$ be a nonnegative
definite matrix such that $\mathrm{trace}(W\Lambda_{12}) \le 0$, $-\mathrm{Ch}_{\min}(W\Lambda_{11}) \le \mathrm{Ch}_{\min}(W\Lambda_{12})$ and
$\mathrm{trace}(W(\Lambda_{11} + \Lambda_{12})) \ge \max(-\mathrm{trace}(W\Lambda_{12}), (k+2)\mathrm{Ch}_{\max}(\Pi^{*})/4)$. Then,

$$\mathrm{ADR}(\hat\delta^{S+}, \delta^0, W) \le \mathrm{ADR}(\hat\delta^{S}, \delta^0, W) \le \mathrm{ADR}(\hat\delta, \delta^0, W), \quad \text{for all } \Delta \ge 0. \tag{4.4}$$

Remark 4.1. It should be noted that the conditions for the shrinkage estimators to
dominate the unrestricted estimator are more general than those given, for example, in Hossain
et al. [7], Corollary 4.2, and Saleh [18], pages 358, 360, 382, the relations (7.4.8), (7.4.31) and
(7.8.35).

Indeed, in the quoted works, we have $\Lambda_{12} = 0$. In this special case, the above condition
can be rewritten as > ^} and this set contains {W : cCipAii) ^ 

which is given in the above quoted works.

5. Illustrative data set and numerical evaluation 

5.1. Simulation study 

In this section, we present some Monte Carlo simulation results to evaluate the
performance of the proposed estimators. This is done by comparing the relative mean
square efficiencies (RMSE) of the estimators with respect to the UE, $\hat\delta$. Recall that
$\mathrm{RMSE}(\delta^{*}) = \mathrm{risk}(\hat\delta)/\mathrm{risk}(\delta^{*})$, where $\delta^{*}$ is the proposed estimator. Note that a relative
efficiency greater than one indicates the degree of superiority of the proposed estimator
over $\hat\delta$. To save space, we report only two cases.
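The relative efficiency above can be estimated from Monte Carlo replications by averaging the loss of each estimator. The sketch below assumes the quadratic loss with identity weight matrix and uses made-up replication draws; the function names are hypothetical.

```python
def empirical_risk(draws, truth):
    """Average squared Euclidean distance between the draws and the true value."""
    return sum(sum((d - t) ** 2 for d, t in zip(draw, truth))
               for draw in draws) / len(draws)

def relative_efficiency(ue_draws, star_draws, truth):
    """RMSE(delta*) = risk(UE) / risk(delta*); values above 1 favour delta*."""
    return empirical_risk(ue_draws, truth) / empirical_risk(star_draws, truth)

# Two illustrative replications of each estimator around the true value (1, 2)'.
truth = [1.0, 2.0]
ue_draws = [[2.0, 2.0], [0.0, 2.0]]      # risk 1.0
star_draws = [[1.5, 2.0], [0.5, 2.0]]    # risk 0.25
print(relative_efficiency(ue_draws, star_draws, truth))
```

In an actual study the draws would come from re-estimating the model on each simulated sample, with as many replications as the design calls for.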

Case 1: the number of unknown parameters is small, with $m = 3$, $q = 2$, $\delta^0 =
(\delta_1^{0\prime}, \delta_2^{0\prime}, \delta_3^{0\prime}, \delta_4^{0\prime})'$ with $\delta_1^0 = \delta_3^0 = (1, 2)'$ and $\delta_2^0 = \delta_4^0 = 0$ (i.e., the zero vector), and the
sample sizes are set to be $T = 40$ with the change points given by (10, 20, 30, 40). Also,
we set $T = 100$ with the change-points (25, 50, 75, 100). Further, the restriction is such
that $R = [E_1, E_2, E_3, E_4, -E_1, -E_2, E_5, E_6]$ where, for $j = 1, 2, \ldots, 6$, $E_j$ is a 6-column
vector with all components equal to zero except the $j$th component, which is equal to 1.

Case 2: the number of unknown parameters is relatively large, with $m = 4$,
$q = 5$, $\delta^0 = (\delta_1^{0\prime}, \delta_2^{0\prime}, \ldots, \delta_5^{0\prime})'$ with $\delta_1^0 = \delta_3^0 = \delta_5^0 = (1, 2, 3, 4, 5)'$, $\delta_2^0 = \delta_4^0 = 0$, and the
sample sizes are $T = 100$ and $T = 500$ with the change-points (20, 40, 60, 80, 100) and
(100, 200, 300, 400, 500), respectively. Further, the restriction matrix $R$ is set to be an $8 \times 25$
matrix with

$$
\begin{aligned}
R_{1,1} &= R_{2,2} = R_{3,3} = R_{4,4} = R_{5,5} = R_{6,6} = R_{7,19} = R_{8,20} = 1, \\
R_{1,11} &= R_{2,12} = R_{3,13} = R_{4,14} = R_{5,15} = -1,
\end{aligned}
$$

and the remaining elements of $R$ are set to be 0.
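As a sanity check on the two designs, the restriction matrices can be built and applied to the true coefficient vectors. The sketch below follows the entries of $R$ as read from the text above (the index pattern is partly reconstructed from a damaged source, so it should be treated as an assumption), and takes $r = 0$ so that the restriction holds at the true parameter.

```python
def unit(j, n):
    """n-vector with a 1 in position j (1-based) and zeros elsewhere."""
    return [1 if i == j - 1 else 0 for i in range(n)]

def neg(v):
    return [-x for x in v]

def apply(R, delta):
    return [sum(a * b for a, b in zip(row, delta)) for row in R]

# Case 1: R = [E1, E2, E3, E4, -E1, -E2, E5, E6] with 6-dimensional columns.
cols1 = [unit(1, 6), unit(2, 6), unit(3, 6), unit(4, 6),
         neg(unit(1, 6)), neg(unit(2, 6)), unit(5, 6), unit(6, 6)]
R1 = [[col[i] for col in cols1] for i in range(6)]      # 6 x 8
delta1 = [1, 2, 0, 0, 1, 2, 0, 0]

# Case 2: 8 x 25 matrix with the listed +1 and -1 entries (1-based indices).
R2 = [[0] * 25 for _ in range(8)]
for i, j in [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 19), (8, 20)]:
    R2[i - 1][j - 1] = 1
for i in range(1, 6):
    R2[i - 1][10 + i - 1] = -1                          # R_{i,10+i} = -1
delta2 = [1, 2, 3, 4, 5] + [0] * 5 + [1, 2, 3, 4, 5] + [0] * 5 + [1, 2, 3, 4, 5]

print(apply(R1, delta1), apply(R2, delta2))
```

Both products are zero vectors, so in each design the restriction encodes equality of the first and third regime coefficients together with zero constraints on components of the second and fourth regimes.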

In each case, we let $z_t \sim \mathcal{N}_q(1, \Sigma)$, where $\Sigma$ is a $q \times q$ symmetric matrix such that
$\Sigma_{a,b} = |0.5|^{|a-b|}$. Also, we let $u_i \sim \mathcal{N}(0, \sigma^2)$, $1 \le \sigma^2 \le 2$, and compute the related RMSE
based on 1000 replications.
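A possible way to generate regressors with the covariance structure $\Sigma_{a,b} = 0.5^{|a-b|}$ is sketched below using only the Python standard library. The Cholesky-based sampler, the function names and the seed are choices made here for illustration; the mean is taken to be the vector of ones, which is one reading of the notation above.

```python
import math
import random

def toeplitz_covariance(q, rho=0.5):
    """Matrix with entries rho^{|a-b|}."""
    return [[rho ** abs(a - b) for b in range(q)] for a in range(q)]

def cholesky(S):
    """Lower-triangular L with L L' = S, for a positive definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L

def draw_regressor(L, rng, mean=1.0):
    """One draw z = mean + L e with e standard normal, so Var(z) = L L'."""
    e = [rng.gauss(0.0, 1.0) for _ in L]
    return [mean + sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(len(L))]

Sigma = toeplitz_covariance(3)
L = cholesky(Sigma)
z_t = draw_regressor(L, random.Random(0))
print(Sigma, z_t)
```

The error terms could be drawn in the same way with `rng.gauss(0.0, sigma)` for a chosen standard deviation.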
Figure 1. RMSE of the restricted and shrinkage estimators (case 1). Panels: (a) T = 40, m = 3, q = 2; (b) T = 100, m = 3, q = 2.


The results of the simulation studies are given in Figures 1 and 2. In summary, the
results corroborate the theoretical finding (given in Corollary 4.2) that the proposed
shrinkage estimators dominate the unrestricted estimator. We also construct, and present
in Appendix C, Figures 3-6, which give some histograms of the UE and RE of the change
points. The results given in Figures 3-6 suggest that both the unrestricted and the
restricted methods work well in estimating the change points.

Figure 2. RMSE of the restricted and shrinkage estimators (case 2). Panels: (a) T = 100, m = 4, q = 5; (b) T = 500, m = 4, q = 5.
Figure 3. Histograms of the UE and RE of change points (case 1 with T = 40). Panels: unrestricted and restricted methods for the change points with true values 10, 20 and 30.
Figure 4. Histograms of the UE and RE of change points (case 1 with T = 100). Panels: unrestricted and restricted methods for the change points with true values 25, 50 and 75.
Figure 5. Histograms of the UE and RE of change points (case 2 with T = 100). Panels: unrestricted and restricted methods for the change points with true values 20, 40, 60 and 80.
Figure 6. Histograms of the UE and RE of change points (case 2 with T = 500). Panels: unrestricted and restricted methods for the change points with true values 100, 200, 300 and 400.
5.2. Data analysis


In this subsection, we illustrate the application of the proposed estimation strategy to
a real data set. We consider a historical (log) gross domestic product
(GDP) data set from 1870 to 1986 for 10 different countries. This data set is used, for
example, in Perron and Yabu [17], and these authors pointed out that most GDP series
in this data set are characterized by at least one major shift, so that a
change-point model is applicable. For each GDP series, we consider the following model:

f 5i(i,t,<i if t = l,...,Ti, 

‘ , ift = ri + l,...,117, 


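with $1 < T_1 < 117$ and, for $i = 1, 2$, $\delta_i$ a 4-column vector. The uncertain restriction is given
by $R\delta = r$ with

$$
R = \begin{pmatrix}
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix},
$$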


and r = {0,0,0,0}'. In practice, the hypothesized restriction means that the log(GDP) 
is suspected to have a linear trend. For the given data, we first use the proposed method 
to calculate the unrestricted and the restricted estimators of the change-point Ti and Ti 
as well as the estimators 5, S, and <1®“''. For the change-point Ti which is a nuisance 
parameter here, we do not compute the shrinkage estimators. The obtained unrestricted 
and restricted estimate of the change-point Ti and Ti are given in Table 1. In order to 
save the space of this paper, we do not report here the point estimates of 6 ,6, 6^ , S^~^, 
but these values are available upon request. Further, we calculate the MSE of each type 
of estimators, by applying the bootstrap method to the residuals. Recall that, in this
paper, the change-points are treated as nuisance parameters. Thus, the construction
of the shrinkage estimators for the change-points is beyond the scope of this paper.

Table 1. Change-points and MSE

| Country | Change-point (UE) | Change-point (RE) | MSE $\hat\delta$ | MSE $\tilde\delta$ | MSE $\hat\delta^{s}$ | MSE $\hat\delta^{s+}$ |
|---|---|---|---|---|---|---|
| Australia | 1907 | 1929 | 1.67004021 | 0.03936242 | 1.64839567 | 1.64839567 |
| Canada | 1931 | 1930 | 2.96623326 | 0.05474518 | 2.87279365 | 2.87279365 |
| Denmark | 1939 | 1939 | 3.99038175 | 0.04765026 | 3.93532691 | 3.93532691 |
| France | 1943 | 1943 | 12.1123258 | 0.1253509 | 11.9030741 | 11.9030741 |
| Germany | 1945 | 1954 | 11.4218637 | 0.1704905 | 11.3279191 | 11.3279191 |
| Italy | 1943 | 1943 | 10.2462836 | 0.1211837 | 10.2079175 | 10.2079175 |
| Norway | 1944 | 1948 | 7.09593981 | 0.03606614 | 6.92396377 | 6.92396377 |
| Sweden | 1924 | 1916 | 0.72605495 | 0.02192452 | 0.70854206 | 0.70854206 |
| U.K. | 1918 | 1919 | 0.61037392 | 0.01496282 | 0.58916536 | 0.57701458 |
| U.S. | 1940 | 1929 | 3.97869572 | 0.05967521 | 3.91443168 | 3.91443168 |

As we can see from Table 1, the MSE of the restricted estimator is much smaller than
the MSE of the other estimators. This may indicate that the true value of the parameter
vector lies in the neighborhood of the chosen restriction. Further, the MSE of the proposed
shrinkage estimators is smaller than the MSE of the unrestricted estimator. This
result agrees with the simulation study above.
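
The workflow of this subsection can be sketched in Python. The sketch is only an illustration under stated assumptions, not the procedure used to produce Table 1: the series is simulated rather than taken from the GDP data, the change-point `T1` is treated as known, the regressors are assumed to form a cubic trend in scaled time, and the shrinkage estimators use the standard Stein-type form $\tilde\delta + (1-(k-2)/\psi)(\hat\delta-\tilde\delta)$ with a Wald-type statistic $\psi$, which may differ in detail from the definitions in the paper. All function names are hypothetical.

```python
import numpy as np

def design(T, T1):
    """Two-regime design matrix; each regime has its own cubic-trend block (assumed regressors)."""
    t = np.arange(1, T + 1) / T                      # scaled time, for numerical conditioning
    Z = np.column_stack([np.ones(T), t, t**2, t**3])
    X = np.zeros((T, 8))
    X[:T1, :4] = Z[:T1]                              # regime 1: t = 1, ..., T1
    X[T1:, 4:] = Z[T1:]                              # regime 2: t = T1 + 1, ..., T
    return X

# Restriction R delta = r from the text: coefficients 3, 4, 7 and 8 are zero.
R = np.zeros((4, 8))
R[[0, 1, 2, 3], [2, 3, 6, 7]] = 1.0
r = np.zeros(4)

def estimators(X, y):
    """Return UE, RE, and Stein-type SE / positive-part SE (assumed standard forms)."""
    T, k = X.shape[0], R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    ue = XtX_inv @ X.T @ y                           # unrestricted least squares
    M = R @ XtX_inv @ R.T
    gap = R @ ue - r
    re = ue - XtX_inv @ R.T @ np.linalg.solve(M, gap)  # restricted least squares
    sigma2 = np.sum((y - X @ ue) ** 2) / (T - X.shape[1])
    psi = gap @ np.linalg.solve(M, gap) / sigma2     # Wald-type distance from the restriction
    shrink = 1.0 - (k - 2) / psi
    se = re + shrink * (ue - re)
    pse = re + max(shrink, 0.0) * (ue - re)
    return ue, re, se, pse

def bootstrap_mse(X, y, n_boot=200, seed=0):
    """Residual bootstrap: resample centred UE residuals, refit, and average squared errors
    around the fitted UE (the choice of reference value is an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    ue = estimators(X, y)[0]
    resid = y - X @ ue
    resid = resid - resid.mean()
    sq_err = np.zeros(4)
    for _ in range(n_boot):
        y_star = X @ ue + rng.choice(resid, size=len(y), replace=True)
        sq_err += [np.sum((e - ue) ** 2) for e in estimators(X, y_star)]
    return dict(zip(["UE", "RE", "SE", "PSE"], sq_err / n_boot))

# Simulated example with 117 observations and a break after observation 60.
rng = np.random.default_rng(1)
T, T1 = 117, 60
X = design(T, T1)
delta0 = np.array([1.0, 2.0, 0.0, 0.0, 1.5, 3.0, 0.0, 0.0])  # satisfies R delta0 = r
y = X @ delta0 + 0.1 * rng.standard_normal(T)
ue, re, se, pse = estimators(X, y)
mse = bootstrap_mse(X, y)
```

In an application with an unknown change-point, `T1` would first be replaced by its unrestricted or restricted estimate before the coefficient estimators are computed.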


6. Conclusion 

The goal of this research was to derive an improved estimation strategy for the regression
coefficients in a multiple linear model with unknown change-points under uncertain restrictions.
In summary, we introduced a class of estimators which includes the UE $\hat\delta$, the RE $\tilde\delta$,
the James-Stein estimator $\hat\delta^{s}$ and the positive-rule Stein estimator $\hat\delta^{s+}$. The main difficulty
consists in the fact that the random quantities $\hat\delta - \tilde\delta$ and $\tilde\delta - \delta^0$ are not asymptotically
uncorrelated, as is commonly the case in the literature. To tackle this difficulty, we generalized
(in Appendix C) Theorems 1-2 in Judge and Bock [9]. Under conditions
more general than those in the literature, we established that $\hat\delta^{s}$ and $\hat\delta^{s+}$ dominate the UE. The
performance of the SEs over the UE is confirmed by the simulation studies. They also show
that the SEs perform better than the RE when one moves far away from the hypothesized
restriction. It should be noted that, in this paper, the tools used for studying shrinkage
estimators are based on noncentral chi-squares. One of the referees suggested investigating
whether the obtained results can be improved by using more recent tools such as Stein’s
unbiased risk estimate. Research on this interesting idea is ongoing.

Another highlight of this paper is that, in deriving the joint asymptotic
normality of the UE and RE, we relax some conditions given in the recent literature. In
particular, we considered here the condition of an $L^2$-mixingale of size $-1/2$, which allows
both the regressors and the errors to have different distributions across breaks
and to exhibit asymptotically weak dependence.


Appendix 

In this section, we give some technical proofs underlying the results established in this
paper. To set up additional notation, let $\|A\|$ denote the Euclidean norm of a vector $A$.
For a matrix $B$, let $\|B\|$ be the vector-induced norm (i.e., $\|B\| = \sup_{x \neq 0}\|Bx\|/\|x\|$).
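
As a small numerical illustration of the induced norm just defined (not part of the proofs), the sketch below compares a crude random-direction approximation of $\sup_{x\neq 0}\|Bx\|/\|x\|$ with the largest singular value of $B$, which is the known closed form when both norms are Euclidean. The matrix `B` and the helper name `induced_norm_lower_bound` are arbitrary choices made for this example.

```python
import numpy as np

def induced_norm_lower_bound(B, n_dirs=2000, seed=0):
    """Approximate sup_{x != 0} ||Bx|| / ||x|| from below using random directions."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((B.shape[1], n_dirs))
    return np.max(np.linalg.norm(B @ x, axis=0) / np.linalg.norm(x, axis=0))

B = np.array([[2.0, 1.0], [0.0, 1.0], [1.0, -1.0]])
spectral = np.linalg.norm(B, 2)                       # induced 2-norm
top_sv = np.linalg.svd(B, compute_uv=False)[0]        # largest singular value
approx = induced_norm_lower_bound(B)
```

The random search can only approach the supremum from below, so `approx` never exceeds `spectral`.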

Appendix A: Technical results underlying the 
asymptotic properties 

First, we establish the following proposition, which plays a central role in deriving the
joint asymptotic normality of the UE and RE. For the sake of simplicity, we set
$D_{i,k^*} = X_{pi} - \mathrm{E}(X_{pi} \mid \mathcal{F}_{p,i+k^*})$ and let $D_{i,k^*,s}$ be the $s$th element of $D_{i,k^*}$.
Proposition A.1. Suppose that Assumptions (A5) and (A6) hold. Then,

( Lp \ Lp Lp 

r—1 / i—1 i—1 

Lp i— 1 

EE E ^ h)J 2 * s Tti^g'^(^Dj ^ ^ 

i=l 3=1 

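and

$$
\sum_{i=1}^{L_p}\bigl[\mathrm{E}\bigl(\mathrm{E}^2(X_{pi,s} \mid \mathcal{F}_{p,i+k^*})\bigr) - \mathrm{E}\bigl(\mathrm{E}^2(X_{pi,s} \mid \mathcal{F}_{p,i+k^*-1})\bigr)\bigr] = \sum_{i=1}^{L_p}\bigl[\mathrm{E}(D^2_{i,k^*-1,s}) - \mathrm{E}(D^2_{i,k^*,s})\bigr].
$$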

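Proof. One can verify that

$$
X_{pi} = \sum_{k^*=-\infty}^{\infty}\bigl[\mathrm{E}(X_{pi} \mid \mathcal{F}_{p,i+k^*}) - \mathrm{E}(X_{pi} \mid \mathcal{F}_{p,i+k^*-1})\bigr] \quad \text{a.s.}
$$

Further, one can verify that

$$
\sum_{i=1}^{L_p}\mathrm{E}\bigl[(D_{i,k^*-1,s} - D_{i,k^*,s})^2\bigr] = \sum_{i=1}^{L_p}\mathrm{E}\bigl[\mathrm{E}^2(X_{pi,s} \mid \mathcal{F}_{p,i+k^*})\bigr] + \sum_{i=1}^{L_p}\mathrm{E}\bigl[\mathrm{E}^2(X_{pi,s} \mid \mathcal{F}_{p,i+k^*-1})\bigr] - 2\sum_{i=1}^{L_p}\mathrm{E}\bigl[\mathrm{E}(X_{pi,s} \mid \mathcal{F}_{p,i+k^*-1})\,\mathrm{E}\bigl(\mathrm{E}(X_{pi,s} \mid \mathcal{F}_{p,i+k^*}) \mid \mathcal{F}_{p,i+k^*-1}\bigr)\bigr],
$$

and then, by using the properties of the conditional expected value, we prove the first
statement. For the second statement, we have

$$
\sum_{i=1}^{L_p}\sum_{j=1}^{i-1}\mathrm{E}\bigl[(D_{i,k^*-1,s} - D_{i,k^*,s})(D_{j,k^*-1,s} - D_{j,k^*,s})\bigr]
= \sum_{i=1}^{L_p}\sum_{j=1}^{i-1}\mathrm{E}\bigl[(D_{j,k^*-1,s} - D_{j,k^*,s})\,\mathrm{E}\bigl((D_{i,k^*-1,s} - D_{i,k^*,s}) \mid \mathcal{F}_{p,j+k^*}\bigr)\bigr] = 0.
$$

The third statement of the proposition follows from similar algebraic computations. □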
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Lemma A.l. Let v\ = Ojnd suppose that Assumptions (A 5 ) and {Aq) hold. 

Then 


Ee 


max 



< IGv? 

- Ti-n 


E E^ "(*) 


- 1 / 2 - 


. k *—0 \ i =0 


The proof follows from Proposition A.1 by following steps similar to those in the proof of
Lemma 3.2 in McLeish [12]. By using this lemma, one establishes the following corollary,
which plays a central role in establishing the joint asymptotic normality of the UE and RE.


Corollary A.1. Under Assumptions (A5) and (A6), we have


Ee 

S=1 


^ ^ Alpi ^ s 


= 0{vl). 


Proof. From Lemma A.1,


EE(^maxgA,,,j )< 


- i 6 wi 


00 / k 


- 1 / 2-1 2 


E "w 


, k *—0 \ 2=0 


(A.l) 


and then, the proof follows directly from the fact that ^(0) < 00 . □ 


Corollary A. 2 . Let vf = +; +i^pt suppose that Assumptions (A 5 ) and 

(Ae) hold. Then, = I, ■ ■ ■ ,rp,rp> 1} is uni¬ 
formly integrable. In particular, = ^^ ■ ■ ■ UpUp> 

1 } is uniformly integrable. 


Proof. Let $S_{j,s} = \sum_{i=1}^{j} X_{pi,s}$, $s = 1, \ldots, q$. By using the same arguments as in the proof
of Lemma 3.5 in McLeish [12], one verifies that the set Lp > 1} is 

uniformly integrable. This completes the proof. □ 


Further, by using Lemma A.1, we establish the following proposition, which is also
useful in establishing the joint asymptotic normality of the UE and RE. To simplify some
notations, let Cmin = niini<p<m+i (cp), and let Lmin = niini<p<m+i(Lp). Eurther, let 
Hi be the cr-field generated by {Uw^jUib^-i,...}, with Ui are random variables de¬ 
fined on iri,F,P) such that Hi-i C Tpi_j, and let Vpi = Yl't={i-i)b - 1 -/ +i^pt^ 
Wp^i = E(Upil'Hi) - EiVpilHi-i), p = 1,2 ,... ,m -I- 1, j = 1,2,.. .,rnun- 
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Proposition A.2. Suppose that Assumptions (As) and (Ae) hold. Then, 

min 

i=l 

- {wi „..., ..., 0 . 


The proof follows from Lemma A.1 along with some algebraic computations.

Proposition A.3. Suppose that the conditions of Proposition A.2 hold. Then,


' min 


•^min ^OO 

2=1 


and 


’Ti+l 'T'a Q 

EEEe 

0 = 12 = 1 S—1 


iWa,r,sm VLL' 


/ ^ ' 0,2,5 

S =1 


> £ 


-^ 0 , 


for all e > 0. 


Proof. By using Assumption (As) along with Proposition A.2 and Slutsky’s theorem, 
we establish the first statement. For the second statement, one verifies that, for each 
$a = 1, 2, \ldots, m+1$, $\{W_{a,i}, \mathcal{H}_i\}$ is an $L^2$-mixingale array of size $-1/2$. Then the rest of the
proof follows from Corollary A.2. □


Appendix B: Asymptotic normality of the UE and 
RE 


Proof of Lemma 3.1. Note that 


/Li i™+i \' 


then 

^min / n_ ibl rm + 1 i bm + 1 

r-v2z%=|iw,+H*+ E E E..---E E 

2 = 1 \2=2'min t = (2 — l)6i-l-l 2=2’min t — — l)67n4-l“l“l 


with r-min =mini<i<m+i(r-i) and H* = (Sf ,S ^',..., , where 

I T/ _ ^A/ 4- 1 -h 


-^m+ 1,2 1 5 


(B.l) 


nin / * \ 

2=1 \ t—(i—l)bj+l / t—Tjbj+l 
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Further, it should be noted that, under Assumptions {Aa) and (As), T tends to infinity 
if and only if Lmin = inini<j<m+i(Aj) tends to infinity. 

By using Lemma A.l along with some algebraic computations, we have 





-^0, 


^ ri ibi 

E E 

\i — 'Fmin t —(z — l)6i-|-l 


n + l 


jbm+1 


Y' 


E E 

j —"^111111 t — — + 




-^0. 


(B.2) 


Therefore, the proof follows from the relations (B.1) and (B.2), together with the central
limit theorem for martingale difference sequences and Slutsky’s theorem. □


Proposition B.l. Under (Ai)-(A6), we have Vt{S — <5°) —-— >ei ^ A/'„(m+i)(0, 

T-s-oo ' ' 


The proof follows by combining Lemma 3.1 and Slutsky’s theorem. 

Proof of Proposition 3.2. Let J = {Z^'R'{R{Z^'R')~^ , we have 

(VTis - 6°y, vt{5 - syy = (i(m+i)g, /(m+i), - R'j'yvT(5 - < 5 °) + (o, -m'jo'. 

Then, the first statement follows directly from Proposition B.l and Slutsky’s theorem, 
along with some algebraic computations. For the second statement, obviously 

((<5 - ^)', (^~ - <5°)')' = ((/,(„+!), 0)', (-4(^+1), Wi))0'((-5 - <5°)', (^ - 5^yy. 


Then, the rest of the proof follows directly from the first statement of the proposition 
along with Slutsky’s theorem. □ 


Appendix C: Some results for the derivation of risk 
functions 

Theorem C.1. Let $h$ be a Borel measurable and real-valued integrable function, let $X \sim
\mathcal{N}_p(\mu, \Sigma)$, where $\Sigma$ is a nonnegative definite matrix with rank $k \le p$. Let $A$ be a $p \times p$
nonnegative definite matrix with rank $k$ such that $\Sigma A$ is an idempotent matrix, $A\Sigma A = A$;
$\Sigma A\Sigma = \Sigma$; and $\Sigma A\mu = \mu$, and let $W = A^{1/2}W^{*}A^{1/2}$, where $W^{*}$ is a nonnegative definite
matrix. Then, $\mathrm{E}[h(X'AX)WX] = \mathrm{E}[h(\chi^2_{k+2}(\mu'A\mu))]W\mu$.

Proof. Let A^At be the Moore-Penrose pseudoinverse of A^A^ gy tbe definition of 
Moore-Penrose pseudo-inverse, we have WX = 
and then. 


Elh(X'AX)X'WX] = Elh(X'AX)X'A^''^A^''^^WA^''^^A^''^X]. 


(C.l) 
Further, since $A^{1/2}\Sigma A^{1/2}$ is a symmetric and idempotent matrix, there exists an orthogonal
matrix $G$ such that $GA^{1/2}\Sigma A^{1/2}G' = ([I_k, 0] : [0, 0])'$. Define $V = GA^{1/2}X$. Then,

$\mathrm{E}[h(X'AX)WA^{1/2\dagger}A^{1/2}X] = \mathrm{E}[h(V_1'V_1)WA^{1/2\dagger}G'[I_k, 0]'V_1]$ with $V_1 = [I_k, 0]GA^{1/2}X$,
and then, the rest of the proof follows from Theorem 1 in Judge and Bock [9] along
with some algebraic computations. □

Remark C.1. For the special case where $\Sigma$ is the $p$-dimensional identity matrix $I_p$, Theorem C.1
gives Theorem 1 in Judge and Bock [9] with $A = W^{*} = I_p$. This shows that the provided
theorem generalizes the quoted classical result.
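
The special case described in the remark, $\mathrm{E}[h(X'X)X] = \mathrm{E}[h(\chi^2_{p+2}(\mu'\mu))]\mu$ for $X \sim \mathcal{N}_p(\mu, I_p)$, can be checked by simulation. The following Monte Carlo sketch is an illustration only: the choice $h(x) = 1/x$ (integrable here since $p \ge 3$), the mean vector and the sample size are assumptions made for the example, and the noncentrality parameter is taken to be the sum of squared means $\mu'\mu$, which is the convention used by NumPy's `noncentral_chisquare`.

```python
import numpy as np

def check_identity(mu, n_draws=200_000, seed=0):
    """Monte Carlo estimates of both sides of E[h(X'X) X] = E[h(chi2_{p+2}(mu'mu))] mu."""
    rng = np.random.default_rng(seed)
    p = mu.size
    h = lambda x: 1.0 / x                               # illustrative choice of h
    X = mu + rng.standard_normal((n_draws, p))          # X ~ N_p(mu, I_p)
    q = np.sum(X**2, axis=1)                            # X'X for each draw
    lhs = np.mean(h(q)[:, None] * X, axis=0)
    chi = rng.noncentral_chisquare(p + 2, mu @ mu, n_draws)
    rhs = np.mean(h(chi)) * mu
    return lhs, rhs

mu = np.array([1.0, 0.5, -0.5, 0.0, 2.0])
lhs, rhs = check_identity(mu)
```

The two vectors should agree up to Monte Carlo error, which shrinks at the usual rate as `n_draws` grows.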

By using Theorem C.l, we establish the following corollary. 

Corollary C.l. Set fi 2 = —Mi and let be as defined in Lemma 3.2. Let h be a Borel 
measurable and real-valued integrable funetion, let W = A^/'^W*A^/'^, W* is a nonnega¬ 
tive definite matrix. Then, we have E[h(£ 5 ^£ 5 )VF£ 5 ] = E[/i(x^_,_ 2 (M 2 ^M 2 ))]bFM 2 - 

Theorem C.2. Let $D_1 = \operatorname{trace}(W\Sigma)$, $D_2 = \mu'W\mu$ and assume the conditions of Theorem C.1
hold. Then, $\mathrm{E}[h(X'AX)X'WX] = \mathrm{E}[h(\chi^2_{k+2}(\mu'A\mu))]D_1 + \mathrm{E}[h(\chi^2_{k+4}(\mu'A\mu))]D_2$.

Proof. By using the same transformation methods as in the proof of Theorem C.1, we
have

$$
\mathrm{E}[h(X'AX)X'WX] = \mathrm{E}\bigl[h(V_1'V_1)V_1'[I_k, 0]GA^{1/2\dagger}WA^{1/2\dagger}G'[I_k, 0]'V_1\bigr].
$$

Therefore, the proof is completed by combining Theorem 2 in Judge and Bock [9]
with some algebraic computations. □

Remark C.2. Note that Theorem C.2 generalizes Theorem 2 in Judge and Bock [9].
Indeed, if $\Sigma = I_p$, the quoted result is obtained by taking $A = I_p$.
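
To illustrate the generalization to a singular covariance matrix, the sketch below checks the identity $\mathrm{E}[h(X'AX)X'WX] = \mathrm{E}[h(\chi^2_{k+2}(\mu'A\mu))]D_1 + \mathrm{E}[h(\chi^2_{k+4}(\mu'A\mu))]D_2$ by simulation. Everything concrete here is an assumption made for the example: $\Sigma$ is a rank-3 matrix in dimension 4 built from a Householder reflection, $A$ is taken to be its Moore-Penrose pseudoinverse (one choice satisfying the conditions of Theorem C.1), $h(x) = 1/(1+x)$, and the noncentrality parameter follows NumPy's sum-of-squares convention.

```python
import numpy as np

p, k = 4, 3
Q = np.eye(p) - 0.5 * np.ones((p, p))          # Householder reflection: orthogonal, symmetric
Qk = Q[:, :k]
d = np.array([2.0, 1.0, 0.5])                  # nonzero eigenvalues of Sigma

Sigma = Qk @ np.diag(d) @ Qk.T                 # rank k < p
A = Qk @ np.diag(1.0 / d) @ Qk.T               # Moore-Penrose pseudoinverse of Sigma
A_half = Qk @ np.diag(d ** -0.5) @ Qk.T        # symmetric square root of A
W_star = np.diag([1.0, 2.0, 0.5, 1.5])         # arbitrary nonnegative definite matrix
W = A_half @ W_star @ A_half
mu = Qk @ np.array([1.0, -0.5, 0.5])           # lies in the range of Sigma

def check_quadratic_identity(n_draws=200_000, seed=0):
    """Monte Carlo estimates of both sides of the identity in Theorem C.2."""
    rng = np.random.default_rng(seed)
    h = lambda x: 1.0 / (1.0 + x)              # illustrative bounded choice of h
    X = mu + (rng.standard_normal((n_draws, k)) * np.sqrt(d)) @ Qk.T   # X ~ N_p(mu, Sigma)
    xax = np.einsum("ni,ij,nj->n", X, A, X)
    xwx = np.einsum("ni,ij,nj->n", X, W, X)
    lhs = np.mean(h(xax) * xwx)
    lam = mu @ A @ mu
    D1, D2 = np.trace(W @ Sigma), mu @ W @ mu
    rhs = (D1 * np.mean(h(rng.noncentral_chisquare(k + 2, lam, n_draws)))
           + D2 * np.mean(h(rng.noncentral_chisquare(k + 4, lam, n_draws))))
    return lhs, rhs

lhs, rhs = check_quadratic_identity()
```

With $\Sigma = I_p$ and $A = I_p$ the same code reduces to a check of the classical result quoted in the remark.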

By using Theorem C.2, we establish the following corollary. 

Corollary C.2. Let Di = trace(lFAii), D2 = ^2Wg,2 and suppose that the con¬ 
ditions of Corollary C.l hold. Then, E[/i(£gA£5)£5lF£5] = E[h{x'i^2iT2^T2))]Di -\- 
F[hixi+4iT2^T2))]D2. 

Proof. This corollary directly follows from Theorem C.2. □ 

Theorem C.3. Let 



where the rank o/En is k<P, with pLy = —pi-x, A'EiiA = A; EnAEn = En; Eh^mx = 
Mx- Further, we assume that W = A^I'^W*A^!"^ , where W* is a nonnegative definite 
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matrix. Then, 

E[h{X'AX)Y'WX] 

= —E[h{xk+2{T'x^Tx))]ti-'x^Tx — ^[h{Xk+2iT'x^Tx))]p-'x^'^i2Wp,x 

+ ^HXk+2{T'x^Tx))] trace(Ai2M^AiiA) + E[h{xl+4{T'x^Tx))]tJ‘'xA^i2W^ix. 

Proof. Using the similar transformation methods as in proof of Theorem C.l, we have 

E[h{X'AX)Y'WX]=E[h{VlVi)E[Y\ViyWA^/^'<G'[Ik,0]'Vi], 

where E[F|Vi] = —p,x + E 2 iA^/^G'[/fe,0]'(Ui — py). Further, from Theorem C.l, 

E[h(U/Vi)M^VFAi/2tG'[/fc,0]Vi] = E[h{xl+2(.TxATxWxWpx 

and 

E[h(U/Ui)/r;[4,0]GAi/2Si2lUAi/2tG'[7fc,0]'Ui] 

= E[h{xl+2{T'xApxWxAE42WA^/^^px, 

and the proof is completed by some algebraic computations. □ 

By using this theorem, we establish the following corollary. 

Corollary C.3. With £5 and £4 defined in Lemma 3.2, and let p 2 = —pi- Then, we have 
E[h{e'^Ae5)e'4W£5] 

= -Hhixl+2iT2^T2))]P2WAiiAp2 - E[h{xl+2iT2^T2))]lJ-2^Ai2WAiiAp2 

+ E[h{xl+2iT2^T2))] trace(Ai2lUAiiA) 

+ E[h(xfe+4(/i2A/r2))]/i2AAi2VFAiiA/x2. 


Proof of Corollary 4.2. By some algebraic computations, we have, 

ADR((5", 5°, IT) - ADR((5, <5°, W) 

= _(fc _ 2)2 trace(lU(An + 2 Ai 2 ))E[xfe^ 2 (A)] 

- (ft - 2)(4AGi - (ft + 2)G2 )E[x^4^(A)], 

where Gi = trace(lU(Aii + A12)), C2 = p'iA{Aii + 4Ai2/(fc + 2 ))Wpi, and G3 = 
trace(lUAii). Then, since ft > 2 , ADR(( 5 '*, 5 °, IF) < ADR( 5 , provided that 

trace(lF(Aii + 2A12)) > 0 and 4 AGi — (ft + 2)G2 > 0 . Note that if G2 = 0 , 4 AGi — 
(ft + 2)G2 > 0 holds for any A > 0 , and if C2 > 0 , 4 AGi — (ft + 2)G2 > 0 holds for 
AGi > (ft + 2)G2/4, which is equivalent to Gi > (ft + 2)G2/(4A). 
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Since C 2 = + 4Ai2/(fc + 2))W and by Courant’s theorem, we have 


Ch„,i„(n*) < 


/r'iA(Aii + 4 A 12 /{k + 2))W KiiA^i 


< Ch„,ax(n*), 


where U * = (Hq + n[,)/ 2 , Ho = ^^/^(Aii + 4 Ai2/(fc + 2 ))WAiiA^/^ and Ch„,i„(n*), 
Chinax(n*) are denoted as the smallest and largest eigenvalue of If*, respectively. 
Then, 4ACi — (fc + 2)C2 > 0 holds if Ci > (fc + 2) Chmax(n*)/4. In addition, since 
trace(lT(Aii + 2 A 12 )) > 0 is equivalent to Ci > — trace(ITAi 2 ), it follows that 


ADR((5", 5°, IT) < ADR(5, <5°, W) 


if trace(IT(Aii +A 12 )) > max(—trace(ITAi 2 ), (fc + 2) Chmax(n*)/4). Further, by similar 
algebraic computations, we prove that ADR((5'*+,5°,IT) < ADR((5'’,(5°,IT), this com¬ 
pletes the proof. □ 
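
The eigenvalue bound used in the proof above rests on a general fact: for any square matrix $\Pi_0$, the quadratic form $x'\Pi_0 x$ equals $x'\Pi^{*}x$ with $\Pi^{*} = (\Pi_0 + \Pi_0')/2$, so the ratio $x'\Pi_0 x / x'x$ lies between the smallest and largest eigenvalues of $\Pi^{*}$. The sketch below checks this numerically for an arbitrary random matrix; the matrix and the helper name `rayleigh_bounds` are assumptions for illustration and are not the specific $\Pi_0$ of the proof.

```python
import numpy as np

def rayleigh_bounds(Pi0):
    """Smallest and largest eigenvalues of the symmetrised matrix (Pi0 + Pi0') / 2."""
    eig = np.linalg.eigvalsh((Pi0 + Pi0.T) / 2.0)   # returned in ascending order
    return eig[0], eig[-1]

rng = np.random.default_rng(0)
Pi0 = rng.standard_normal((4, 4))                   # arbitrary non-symmetric matrix
ch_min, ch_max = rayleigh_bounds(Pi0)

x = rng.standard_normal((4, 1000))                  # 1000 random directions
ratios = np.einsum("in,ij,jn->n", x, Pi0, x) / np.sum(x**2, axis=0)
```

Every entry of `ratios` falls in the interval `[ch_min, ch_max]`, which is the step that lets the proof replace a condition on the quadratic form by a condition on the largest eigenvalue.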


Acknowledgements 

The authors would like to acknowledge the financial support received from the Natural Sciences
and Engineering Research Council of Canada. Further, the authors would like to
thank the anonymous referees for useful comments and suggestions.


References 

[1] Bai, J. and Perron, P. (2003). Computation and analysis of multiple structural change models. J. Appl. Econometr. 18 1-22.

[2] Baranchik, A. (1964). Multiple regression and estimation of the mean of a multivariate normal distribution. Technical Report No. 51, Dept. Statistics, Stanford Univ.

[3] Braun, J.V. and Müller, H.G. (1998). Statistical methods for DNA sequence segmentation. Statist. Sci. 13 142-162.

[4] Broemeling, L.D. and Tsurumi, H. (1987). Econometrics and Structural Change. Statistics: Textbooks and Monographs 74. New York: Dekker, Inc. MR0922263

[5] Fu, Y.-X. and Curnow, R.N. (1990). Locating a changed segment in a sequence of Bernoulli variables. Biometrika 77 295-304. MR1064801

[6] Fu, Y.-X. and Curnow, R.N. (1990). Maximum likelihood estimation of multiple change points. Biometrika 77 563-573. MR1087847

[7] Hossain, S., Doksum, K.A. and Ahmed, S.E. (2009). Positive shrinkage, improved pretest and absolute penalty estimators in partially linear models. Linear Algebra Appl. 430 2749-2761. MR2509855

[8] James, W. and Stein, C. (1961). Estimation with quadratic loss. In Proc. 4th Berkeley Sympos. Math. Statist. and Prob. I 361-379. Berkeley, CA: Univ. California Press. MR0133191

[9] Judge, G.G. and Bock, M.E. (1978). The Statistical Implications of Pre-Test and Stein-Rule Estimators in Econometrics. Amsterdam: North-Holland. MR0483199



Stein rules in linear models with change-points 




[10] Judge, G.G. and Mittelhammer, R.C. (2004). A semiparametric basis for combining estimation problems under quadratic loss. J. Amer. Statist. Assoc. 99 479-487. MR2062833

[11] Lombard, F. (1986). The change-point problem for angular data: A nonparametric approach. Technometrics 28 391-397.

[12] McLeish, D.L. (1977). On the invariance principle for nonstationary mixingales. Ann. Probab. 5 616-621. MR0445583

[13] Nkurunziza, S. (2011). Shrinkage strategy in stratified random sample subject to measurement error. Statist. Probab. Lett. 81 317-325. MR2764300

[14] Nkurunziza, S. (2012). The risk of pretest and shrinkage estimators. Statistics 46 305-312. MR2929155

[15] Nkurunziza, S. and Ahmed, S.E. (2010). Shrinkage drift parameter estimation for multi-factor Ornstein-Uhlenbeck processes. Appl. Stoch. Models Bus. Ind. 26 103-124. MR2722886

[16] Perron, P. and Qu, Z. (2006). Estimating restricted structural change models. J. Econometrics 134 373-399. MR2328414

[17] Perron, P. and Yabu, T. (2009). Testing for shifts in trend with an integrated or stationary noise component. J. Bus. Econom. Statist. 27 369-396. MR2554242

[18] Saleh, A.K.Md.E. (2006). Theory of Preliminary Test and Stein-Type Estimation with Applications. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley. MR2218139

[19] Tan, Z. (2014). Improved minimax estimation of a multivariate normal mean under heteroscedasticity. Bernoulli. To appear.

[20] Zeileis, A., Kleiber, C., Krämer, W. and Hornik, K. (2003). Testing and dating of structural changes in practice. Comput. Statist. Data Anal. 44 109-123. MR2019790

Received October 2013 and revised May 2014 


